    Data Deluge in Astrophysics: Photometric Redshifts as a Template Use Case

    Astronomy has entered the big data era, and machine-learning-based methods have found widespread use in a large variety of astronomical applications. This is demonstrated by the recent huge increase in the number of publications making use of this new approach. The usage of machine learning methods, however, is still far from trivial, and many problems still need to be solved. Using the evaluation of photometric redshifts as a case study, we outline the main problems and some ongoing efforts to solve them.
    Comment: 13 pages, 3 figures, Springer's Communications in Computer and Information Science (CCIS), Vol. 82

    Data Driven Discovery in Astrophysics

    We review some aspects of the current state of data-intensive astronomy, its methods, and some outstanding data analysis challenges. Astronomy is at the forefront of "big data" science, with exponentially growing data volumes and data rates, and an ever-increasing complexity, now entering the Petascale regime. Telescopes and observatories from both ground and space, covering a full range of wavelengths, feed the data via processing pipelines into dedicated archives, where they can be accessed for scientific analysis. Most of the large archives are connected through the Virtual Observatory framework, which provides interoperability standards and services, and effectively constitutes a global data grid of astronomy. Making discoveries in this overabundance of data requires applications of novel machine learning tools. We describe some recent examples of such applications.
    Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figures

    Photometric redshifts for Quasars in multi band Surveys

    MLPQNA stands for Multi Layer Perceptron with Quasi Newton Algorithm; it is a machine learning method that can be used for regression and classification problems on complex and massive data sets. In this paper we give the formal description of the method and present the results of its application to the evaluation of photometric redshifts for quasars. The data set used for the experiment was obtained by merging four different surveys (SDSS, GALEX, UKIDSS and WISE), thus covering a wide range of wavelengths from the UV to the mid-infrared. The method is able i) to achieve a very high accuracy; ii) to drastically reduce the number of outliers and catastrophic objects; iii) to discriminate among parameters (or features) on the basis of their significance, so that the number of features used for training and analysis can be optimized in order to reduce both the computational demands and the effects of degeneracy. The best experiment, which makes use of a selected combination of parameters drawn from the four surveys, leads, in terms of DeltaZnorm (i.e. (zspec-zphot)/(1+zspec)), to an average of DeltaZnorm = 0.004, a standard deviation sigma = 0.069 and a Median Absolute Deviation MAD = 0.02 over the whole redshift range (i.e. zspec <= 3.6), defined by the 4-survey cross-matched spectroscopic sample. The fraction of catastrophic outliers, i.e. of objects with photo-z deviating by more than 2 sigma from the spectroscopic value, is < 3%, leading to sigma = 0.035 after their removal, over the same redshift range. The method is made available to the community through the DAMEWARE web application.
    Comment: 38 pages, Submitted to ApJ in February 2013; Accepted by ApJ in May 201
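    The statistics quoted in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `photoz_stats` and its parameters are hypothetical, the MAD is taken as the median absolute deviation about the median of DeltaZnorm, and "deviating by more than 2 sigma" is assumed to mean |DeltaZnorm| > 2 sigma. The paper may define either quantity differently.

```python
import numpy as np


def photoz_stats(z_spec, z_phot, n_sigma=2.0):
    """Summary statistics of the normalized photo-z residuals.

    Hypothetical helper for illustration; the MAD and outlier definitions
    below are assumptions, not taken from the paper.
    """
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)

    # DeltaZnorm = (zspec - zphot) / (1 + zspec), as defined in the abstract.
    dz = (z_spec - z_phot) / (1.0 + z_spec)

    bias = dz.mean()
    sigma = dz.std()
    # Assumed definition: median absolute deviation about the median.
    mad = np.median(np.abs(dz - np.median(dz)))

    # Assumed criterion: catastrophic outliers have |dz| > n_sigma * sigma.
    is_outlier = np.abs(dz) > n_sigma * sigma
    kept = dz[~is_outlier]

    return {
        "bias": bias,
        "sigma": sigma,
        "mad": mad,
        "outlier_fraction": is_outlier.mean(),
        # Standard deviation recomputed after removing the outliers.
        "sigma_clean": kept.std() if kept.size else float("nan"),
    }


if __name__ == "__main__":
    # Toy values only, not data from the paper.
    stats = photoz_stats([0.0, 1.0, 1.0, 3.0], [0.0, 1.0, 0.0, 3.0])
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

    Recomputing sigma on the retained objects mirrors the abstract's statement that sigma drops once the catastrophic outliers are removed.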

    Star Formation Rates for photometric samples of galaxies using machine learning methods

    Star Formation Rates (SFRs) are crucial to constrain theories of galaxy formation and evolution. SFRs are usually estimated via spectroscopic observations requiring large amounts of telescope time. We explore an alternative approach based on the photometric estimation of global SFRs for large samples of galaxies, using methods such as automatic parameter space optimisation and supervised Machine Learning models. We demonstrate that, with such an approach, accurate multi-band photometry allows reliable SFRs to be estimated. We also investigate how the use of photometric rather than spectroscopic redshifts affects the accuracy of the derived global SFRs. Finally, we provide a publicly available catalogue of SFRs for more than 27 million galaxies extracted from the Sloan Digital Sky Survey Data Release 7. The catalogue is available through the Vizier facility at the following link: ftp://cdsarc.u-strasbg.fr/pub/cats/J/MNRAS/486/1377

    AIDA, a Modular Web Application for Astronomical Data Analysis and Instrument Monitoring Services

    Over the last decade, astronomy has seen the realization of panchromatic surveys, with sophisticated instruments acquiring huge quantities of exceptionally high-quality data. This creates the need to integrate advanced data-driven science methodologies for the automatic exploration of huge data archives, as well as the need for efficient short- and long-term monitoring and diagnostics systems. The goal is to keep the quality of the observations under control and to detect and circumscribe anomalies and malfunctions, facilitating rapid and effective corrections and ensuring correct maintenance of all components and the good health of scientific data over time. This requirement is particularly crucial for space-borne observation systems, in both logistical and economic terms. AIDA (Advanced Infrastructure for Data Analysis) is a portable and modular web application, designed to provide an efficient and intuitive software infrastructure that supports long-term monitoring of data acquisition systems, diagnostics, and both scientific and engineering data quality analysis; it is particularly suited for astronomical instruments. Thanks to its modular design, its functionality can be extended by integrating and customizing monitoring and diagnostics systems, as well as scientific data analysis solutions, including machine/deep learning and data mining techniques and methods. A specialized version of AIDA has recently been selected as the focal plane instrument operation diagnostics, analytics and monitoring service within the Science Ground Segment of the Euclid space mission.

    Astrophysics in S.Co.P.E.

    S.Co.P.E. is one of the four projects funded by the Italian Government in order to provide Southern Italy with a distributed computing infrastructure for fundamental science. Besides being aimed at building the infrastructure, S.Co.P.E. is also actively pursuing research in several areas, including astrophysics and observational cosmology. We briefly summarize the most significant results obtained in the first two years of the project, related to the development of middleware and Data Mining tools for the Virtual Observatory.

    SDSS-DR9 photometric redshifts

    Accurate photometric redshifts for large samples of galaxies are among the main products of modern multiband digital surveys. Over the last decade, the Sloan Digital Sky Survey (SDSS) has become a sort of benchmark against which to test the various methods. We present an application of a new method to the estimation of photometric redshifts for the galaxies in the SDSS Data Release 9 (SDSS-DR9). Photometric redshifts for more than 143 million galaxies were produced. The MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) model, provided within the framework of the DAMEWARE (DAta Mining and Exploration Web Application REsource), is an interpolative method derived from machine learning models. The obtained redshifts have an overall uncertainty of sigma = 0.023 with a very small average bias of about 3x10^-5, and a fraction of catastrophic outliers of about 5%. This result is slightly better than what was already available in the literature, particularly in terms of the smaller fraction of catastrophic outliers.

    A catalogue of photometric redshifts for the SDSS-DR9 galaxies

    Accurate photometric redshifts for large samples of galaxies are among the main products of modern multiband digital surveys. Over the last decade, the Sloan Digital Sky Survey (SDSS) has become a sort of benchmark against which to test the various methods. We present an application of a new method to the estimation of photometric redshifts for the galaxies in the SDSS Data Release 9 (SDSS-DR9). Photometric redshifts for more than 143 million galaxies were produced and made available at the URL: http://dame.dsf.unina.it/catalog/DR9PHOTOZ/. The MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) model, provided within the framework of the DAMEWARE (DAta Mining and Exploration Web Application REsource), is an interpolative method derived from machine learning models. The obtained redshifts have an overall uncertainty of sigma = 0.023 with a very small average bias of about 3x10^-5, and a fraction of catastrophic outliers of about 5%. This result is slightly better than what was already available in the literature, also in terms of the smaller fraction of catastrophic outliers.